Dictionary Selection Using Partial Matching

Authors

  • Dzung T. Hoang
  • Philip M. Long
  • Jeffrey Scott Vitter
Abstract

This work concerns the search for text compressors that compress better than existing dictionary coders, but run faster than statistical coders. We describe a new method for text compression using multiple dictionaries, one for each context of preceding characters, where the contexts have varying lengths. The context to be used is determined using an escape mechanism similar to that of prediction by partial matching (PPM) methods. We describe modifications of three popular dictionary coders along these lines and experiments evaluating their effectiveness using the text files in the Calgary corpus. Our results suggest that modifying LZ77, LZFG, and LZW along these lines yields improvements in compression of about 3%, 6%, and 15%, respectively. © 1999 Elsevier Science Inc. All rights reserved.
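The idea of one dictionary per context, with PPM-style escapes to shorter contexts, can be illustrated with a toy coder. The sketch below is not the authors' modification of LZ77, LZFG, or LZW. The token format (`"P"` for a phrase index, `"E"` for an escape, `"L"` for a literal), the update rule, the function names, and the `max_order` parameter are all assumptions made for illustration, and tokens are left as Python tuples rather than packed into bits.

```python
def _longest_match(phrases, text, pos):
    """Index of the longest phrase that is a prefix of text[pos:], or None."""
    best = None
    for idx, phrase in enumerate(phrases):
        if text.startswith(phrase, pos) and (
            best is None or len(phrase) > len(phrases[best])
        ):
            best = idx
    return best


def _learn(dicts, text, start, end, max_order):
    """Add text[start:end] to the dictionary of every context ending at `start`."""
    phrase = text[start:end]
    for k in range(min(max_order, start) + 1):
        entries = dicts.setdefault(text[start - k:start], [])
        if phrase not in entries:
            entries.append(phrase)


def encode(data, max_order=2):
    """Encode a string into a list of ("P", idx), ("E",) and ("L", char) tokens."""
    dicts, tokens = {}, []
    pos, prev_start = 0, None
    while pos < len(data):
        length = None
        # Try the longest context first, escaping to shorter ones on a miss.
        for k in range(min(max_order, pos), -1, -1):
            phrases = dicts.get(data[pos - k:pos])
            if not phrases:
                continue  # empty dictionary: the decoder skips it too
            idx = _longest_match(phrases, data, pos)
            if idx is not None:
                tokens.append(("P", idx))
                length = len(phrases[idx])
                break
            tokens.append(("E",))  # escape to the next shorter context
        if length is None:
            tokens.append(("L", data[pos]))  # no context matched: literal
            length = 1
        # Learn "previous token + first character of this token".
        if prev_start is not None:
            _learn(dicts, data, prev_start, pos + 1, max_order)
        prev_start = pos
        pos += length
    return tokens


def decode(tokens, max_order=2):
    """Invert encode() by rebuilding the same dictionaries in lockstep."""
    dicts, out, prev_start = {}, "", None
    stream = iter(tokens)
    while True:
        pos, text = len(out), None
        try:
            for k in range(min(max_order, pos), -1, -1):
                phrases = dicts.get(out[pos - k:pos])
                if not phrases:
                    continue
                token = next(stream)
                if token[0] == "P":
                    text = phrases[token[1]]
                    break
                # Otherwise the token is an escape: try a shorter context.
            if text is None:
                text = next(stream)[1]  # literal character
        except StopIteration:
            return out
        out += text
        if prev_start is not None:
            _learn(dicts, out, prev_start, pos + 1, max_order)
        prev_start = pos
```

In this sketch a phrase index only has to be unique within its own context's dictionary, which is why conditioning on preceding characters can shorten the codes. The escape tokens are what that saving costs whenever a context fails to match.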

Similar resources

DICTIONARY-BASED CODE COMPRESSION TECHNIQUES USING BIT-MASKS FOR EMBEDDED SYSTEMS By SEOK-WON SEONG A THESIS PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE

of Thesis Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Master of Science DICTIONARY-BASED CODE COMPRESSION TECHNIQUES USING BIT-MASKS FOR EMBEDDED SYSTEMS By Seok-Won Seong May 2006 Chair: Prabhat Mishra Major Department: Computer and Information Science and Engineering Memory is one of the most restricted resource in...

Speech Enhancement using Adaptive Data-Based Dictionary Learning

In this paper, a speech enhancement method based on sparse representation of data frames has been presented. Speech enhancement is one of the most applicable areas in different signal processing fields. The objective of a speech enhancement system is improvement of either intelligibility or quality of the speech signals. This process is carried out using the speech signal processing techniques ...

Database Schema Matching Using Machine Learning with Feature Selection

Schema matching, the problem of finding mappings between the attributes of two semantically related database schemas, is an important aspect of many database applications such as schema integration, data warehousing, and electronic commerce. Unfortunately, schema matching remains largely a manual, labor-intensive process. Furthermore, the effort required is typically linear in the number of sch...

Sinusoidal modeling of audio and speech using psychoacoustic-adaptive matching pursuits

In this paper, we propose a segment-based matching pursuit algorithm where the psychoacoustical properties of the human auditory system are taken into account. Rather than scaling the dictionary elements according to auditory perception, we define a psychoacoustic-adaptive norm on the signal space which can be used for assigning the dictionary elements to the individual segments in a rate-disto...

Term Selection Term Selection Query - language Term Translation Doc - language Term Selection Term Weighting Term Matching Term Weighting Term Matching

This paper presents results for the Japanese/English cross-language information retrieval task on the NACSIS Test Collection. Two automatic dictionary-based query translation techniques were tried with four variants of the queries. The results indicate that longer queries outperform the required description-only queries and that use of the first translation in the edict dictionary is comparable w...

Journal title:
  • Inf. Sci.

Volume 119  Issue 

Pages  -

Publication date 1999